Forensic STR allele extraction using a machine learning paradigm.
Internal identifier: 000197 (Main/Exploration); previous: 000196; next: 000198
Authors: Yao-Yuan Liu [New Zealand]; David Welch [New Zealand]; Ryan England [New Zealand]; Janet Stacey [New Zealand]; Sallyann Harbison [New Zealand]
Source:
- Forensic science international. Genetics [ 1878-0326 ] ; 2020.
Abstract
We present a machine learning approach to short tandem repeat (STR) sequence detection and extraction from massively parallel sequencing data called Fragsifier. Using this approach, STRs are detected on each read by first locating the longest repeat stretches followed by locus prediction using k-mers in a machine learning sequence model. This is followed by reference flanking sequence alignment to determine precise STR boundaries. We show that Fragsifier produces genotypes that are concordant with profiles obtained using capillary electrophoresis (CE), and also compared the results with that of STRait Razor and the ForenSeq UAS. The data pre-processing and training of the sequence classifier is readily scripted, allowing the analyst to experiment with different thresholds, datasets and loci of interest, and different machine learning models.
DOI: 10.1016/j.fsigen.2019.102194
PubMed: 31698330
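The first two steps described in the abstract (locating the longest repeat stretch on a read, then predicting the locus from k-mers) can be illustrated with a minimal Python sketch. This is not Fragsifier's implementation: the function names, the fixed 4-base period, the toy reference sequences, and the shared-k-mer scorer used here in place of the trained machine learning sequence model are all assumptions made for illustration. The final flanking-sequence alignment step is not shown.

```python
from collections import Counter


def longest_repeat_stretch(read, period=4, min_units=3):
    """Return (start, end) of the longest stretch in which every base equals
    the base `period` positions earlier, or None if no stretch spans at least
    `min_units` repeat units. Hypothetical helper, not Fragsifier's code."""
    best = None
    run_start = 0
    for i in range(period, len(read) + 1):
        # A run ends at the end of the read or where the periodicity breaks.
        if i == len(read) or read[i] != read[i - period]:
            run_len = i - run_start
            if run_len >= period * min_units and (
                best is None or run_len > best[1] - best[0]
            ):
                best = (run_start, i)
            # The next candidate run begins one window back from the mismatch.
            run_start = i - period + 1
    return best


def kmer_counts(seq, k=4):
    """Count every overlapping k-mer in seq."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def predict_locus(read, references, k=4):
    """Toy stand-in for the sequence classifier: return the reference label
    whose k-mer multiset overlaps most with the read's."""
    read_kmers = kmer_counts(read, k)

    def shared(label):
        return sum((read_kmers & kmer_counts(references[label], k)).values())

    return max(references, key=shared)


# Made-up sequences for illustration only; these are not real STR loci.
references = {
    "locus_A": "ACGTGATAGATAGATACCTG",
    "locus_B": "TTTTCAGCAGCAGCAGGGGG",
}
read = "ACGT" + "GATA" * 5 + "CCTG"

stretch = longest_repeat_stretch(read)
if stretch is not None:
    start, end = stretch
    print(predict_locus(read, references), read[start:end])
```

A periodicity scan like this can absorb a few flanking bases that happen to continue the repeat pattern, which is one reason a later boundary-refinement step, such as the reference flanking alignment the abstract mentions, is needed.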
Links toward previous steps (curation, corpus...)
- to stream PubMed, to step Corpus: 000372
- to stream PubMed, to step Curation: 000372
- to stream PubMed, to step Checkpoint: 000200
- to stream Ncbi, to step Merge: 002392
- to stream Ncbi, to step Curation: 002392
- to stream Ncbi, to step Checkpoint: 002392
- to stream Main, to step Merge: 000200
- to stream Main, to step Curation: 000197
The document in XML format
<record><TEI><teiHeader><fileDesc><titleStmt><title xml:lang="en">Forensic STR allele extraction using a machine learning paradigm.</title>
<author><name sortKey="Liu, Yao Yuan" sort="Liu, Yao Yuan" uniqKey="Liu Y" first="Yao-Yuan" last="Liu">Yao-Yuan Liu</name>
<affiliation wicri:level="1"><nlm:affiliation>Forensic Science Program, School of Chemical Sciences, University of Auckland, 38 Princes Street, Auckland 1010, New Zealand.</nlm:affiliation>
<country xml:lang="fr">Nouvelle-Zélande</country>
<wicri:regionArea>Forensic Science Program, School of Chemical Sciences, University of Auckland, 38 Princes Street, Auckland 1010</wicri:regionArea>
<wicri:noRegion>Auckland 1010</wicri:noRegion>
</affiliation>
</author>
<author><name sortKey="Welch, David" sort="Welch, David" uniqKey="Welch D" first="David" last="Welch">David Welch</name>
<affiliation wicri:level="1"><nlm:affiliation>School of Computer Science, University of Auckland, 38 Princes Street, Auckland 1010, New Zealand.</nlm:affiliation>
<country xml:lang="fr">Nouvelle-Zélande</country>
<wicri:regionArea>School of Computer Science, University of Auckland, 38 Princes Street, Auckland 1010</wicri:regionArea>
<wicri:noRegion>Auckland 1010</wicri:noRegion>
</affiliation>
</author>
<author><name sortKey="England, Ryan" sort="England, Ryan" uniqKey="England R" first="Ryan" last="England">Ryan England</name>
<affiliation wicri:level="1"><nlm:affiliation>Forensic Science Program, School of Chemical Sciences, University of Auckland, 38 Princes Street, Auckland 1010, New Zealand; Institute of Environmental Science and Research Limited, Private Bag 92021, Auckland 1142, New Zealand.</nlm:affiliation>
<country xml:lang="fr">Nouvelle-Zélande</country>
<wicri:regionArea>Forensic Science Program, School of Chemical Sciences, University of Auckland, 38 Princes Street, Auckland 1010, New Zealand; Institute of Environmental Science and Research Limited, Private Bag 92021, Auckland 1142</wicri:regionArea>
<wicri:noRegion>Auckland 1142</wicri:noRegion>
</affiliation>
</author>
<author><name sortKey="Stacey, Janet" sort="Stacey, Janet" uniqKey="Stacey J" first="Janet" last="Stacey">Janet Stacey</name>
<affiliation wicri:level="1"><nlm:affiliation>Institute of Environmental Science and Research Limited, Private Bag 92021, Auckland 1142, New Zealand.</nlm:affiliation>
<country xml:lang="fr">Nouvelle-Zélande</country>
<wicri:regionArea>Institute of Environmental Science and Research Limited, Private Bag 92021, Auckland 1142</wicri:regionArea>
<wicri:noRegion>Auckland 1142</wicri:noRegion>
</affiliation>
</author>
<author><name sortKey="Harbison, Sallyann" sort="Harbison, Sallyann" uniqKey="Harbison S" first="Sallyann" last="Harbison">Sallyann Harbison</name>
<affiliation wicri:level="1"><nlm:affiliation>Institute of Environmental Science and Research Limited, Private Bag 92021, Auckland 1142, New Zealand. Electronic address: sallyann.harbison@esr.cri.nz.</nlm:affiliation>
<country xml:lang="fr">Nouvelle-Zélande</country>
<wicri:regionArea>Institute of Environmental Science and Research Limited, Private Bag 92021, Auckland 1142</wicri:regionArea>
<wicri:noRegion>Auckland 1142</wicri:noRegion>
</affiliation>
</author>
</titleStmt>
<publicationStmt><idno type="wicri:source">PubMed</idno>
<date when="2020">2020</date>
<idno type="RBID">pubmed:31698330</idno>
<idno type="pmid">31698330</idno>
<idno type="doi">10.1016/j.fsigen.2019.102194</idno>
<idno type="wicri:Area/PubMed/Corpus">000372</idno>
<idno type="wicri:explorRef" wicri:stream="PubMed" wicri:step="Corpus" wicri:corpus="PubMed">000372</idno>
<idno type="wicri:Area/PubMed/Curation">000372</idno>
<idno type="wicri:explorRef" wicri:stream="PubMed" wicri:step="Curation">000372</idno>
<idno type="wicri:Area/PubMed/Checkpoint">000200</idno>
<idno type="wicri:explorRef" wicri:stream="Checkpoint" wicri:step="PubMed">000200</idno>
<idno type="wicri:Area/Ncbi/Merge">002392</idno>
<idno type="wicri:Area/Ncbi/Curation">002392</idno>
<idno type="wicri:Area/Ncbi/Checkpoint">002392</idno>
<idno type="wicri:Area/Main/Merge">000200</idno>
<idno type="wicri:Area/Main/Curation">000197</idno>
<idno type="wicri:Area/Main/Exploration">000197</idno>
</publicationStmt>
<sourceDesc><biblStruct><analytic><title xml:lang="en">Forensic STR allele extraction using a machine learning paradigm.</title>
<author><name sortKey="Liu, Yao Yuan" sort="Liu, Yao Yuan" uniqKey="Liu Y" first="Yao-Yuan" last="Liu">Yao-Yuan Liu</name>
<affiliation wicri:level="1"><nlm:affiliation>Forensic Science Program, School of Chemical Sciences, University of Auckland, 38 Princes Street, Auckland 1010, New Zealand.</nlm:affiliation>
<country xml:lang="fr">Nouvelle-Zélande</country>
<wicri:regionArea>Forensic Science Program, School of Chemical Sciences, University of Auckland, 38 Princes Street, Auckland 1010</wicri:regionArea>
<wicri:noRegion>Auckland 1010</wicri:noRegion>
</affiliation>
</author>
<author><name sortKey="Welch, David" sort="Welch, David" uniqKey="Welch D" first="David" last="Welch">David Welch</name>
<affiliation wicri:level="1"><nlm:affiliation>School of Computer Science, University of Auckland, 38 Princes Street, Auckland 1010, New Zealand.</nlm:affiliation>
<country xml:lang="fr">Nouvelle-Zélande</country>
<wicri:regionArea>School of Computer Science, University of Auckland, 38 Princes Street, Auckland 1010</wicri:regionArea>
<wicri:noRegion>Auckland 1010</wicri:noRegion>
</affiliation>
</author>
<author><name sortKey="England, Ryan" sort="England, Ryan" uniqKey="England R" first="Ryan" last="England">Ryan England</name>
<affiliation wicri:level="1"><nlm:affiliation>Forensic Science Program, School of Chemical Sciences, University of Auckland, 38 Princes Street, Auckland 1010, New Zealand; Institute of Environmental Science and Research Limited, Private Bag 92021, Auckland 1142, New Zealand.</nlm:affiliation>
<country xml:lang="fr">Nouvelle-Zélande</country>
<wicri:regionArea>Forensic Science Program, School of Chemical Sciences, University of Auckland, 38 Princes Street, Auckland 1010, New Zealand; Institute of Environmental Science and Research Limited, Private Bag 92021, Auckland 1142</wicri:regionArea>
<wicri:noRegion>Auckland 1142</wicri:noRegion>
</affiliation>
</author>
<author><name sortKey="Stacey, Janet" sort="Stacey, Janet" uniqKey="Stacey J" first="Janet" last="Stacey">Janet Stacey</name>
<affiliation wicri:level="1"><nlm:affiliation>Institute of Environmental Science and Research Limited, Private Bag 92021, Auckland 1142, New Zealand.</nlm:affiliation>
<country xml:lang="fr">Nouvelle-Zélande</country>
<wicri:regionArea>Institute of Environmental Science and Research Limited, Private Bag 92021, Auckland 1142</wicri:regionArea>
<wicri:noRegion>Auckland 1142</wicri:noRegion>
</affiliation>
</author>
<author><name sortKey="Harbison, Sallyann" sort="Harbison, Sallyann" uniqKey="Harbison S" first="Sallyann" last="Harbison">Sallyann Harbison</name>
<affiliation wicri:level="1"><nlm:affiliation>Institute of Environmental Science and Research Limited, Private Bag 92021, Auckland 1142, New Zealand. Electronic address: sallyann.harbison@esr.cri.nz.</nlm:affiliation>
<country xml:lang="fr">Nouvelle-Zélande</country>
<wicri:regionArea>Institute of Environmental Science and Research Limited, Private Bag 92021, Auckland 1142</wicri:regionArea>
<wicri:noRegion>Auckland 1142</wicri:noRegion>
</affiliation>
</author>
</analytic>
<series><title level="j">Forensic science international. Genetics</title>
<idno type="eISSN">1878-0326</idno>
<imprint><date when="2020" type="published">2020</date>
</imprint>
</series>
</biblStruct>
</sourceDesc>
</fileDesc>
<profileDesc><textClass></textClass>
</profileDesc>
</teiHeader>
<front><div type="abstract" xml:lang="en">We present a machine learning approach to short tandem repeat (STR) sequence detection and extraction from massively parallel sequencing data called Fragsifier. Using this approach, STRs are detected on each read by first locating the longest repeat stretches followed by locus prediction using k-mers in a machine learning sequence model. This is followed by reference flanking sequence alignment to determine precise STR boundaries. We show that Fragsifier produces genotypes that are concordant with profiles obtained using capillary electrophoresis (CE), and also compared the results with that of STRait Razor and the ForenSeq UAS. The data pre-processing and training of the sequence classifier is readily scripted, allowing the analyst to experiment with different thresholds, datasets and loci of interest, and different machine learning models.</div>
</front>
</TEI>
<affiliations><list><country><li>Nouvelle-Zélande</li>
</country>
</list>
<tree><country name="Nouvelle-Zélande"><noRegion><name sortKey="Liu, Yao Yuan" sort="Liu, Yao Yuan" uniqKey="Liu Y" first="Yao-Yuan" last="Liu">Yao-Yuan Liu</name>
</noRegion>
<name sortKey="England, Ryan" sort="England, Ryan" uniqKey="England R" first="Ryan" last="England">Ryan England</name>
<name sortKey="Harbison, Sallyann" sort="Harbison, Sallyann" uniqKey="Harbison S" first="Sallyann" last="Harbison">Sallyann Harbison</name>
<name sortKey="Stacey, Janet" sort="Stacey, Janet" uniqKey="Stacey J" first="Janet" last="Stacey">Janet Stacey</name>
<name sortKey="Welch, David" sort="Welch, David" uniqKey="Welch D" first="David" last="Welch">David Welch</name>
</country>
</tree>
</affiliations>
</record>
To manipulate this document under Unix (Dilib)
EXPLOR_STEP=$WICRI_ROOT/Sante/explor/MersV1/Data/Main/Exploration
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 000197 | SxmlIndent | more
Or
HfdSelect -h $EXPLOR_AREA/Data/Main/Exploration/biblio.hfd -nk 000197 | SxmlIndent | more
To link to this page within the Wicri network
{{Explor lien |wiki= Sante |area= MersV1 |flux= Main |étape= Exploration |type= RBID |clé= pubmed:31698330 |texte= Forensic STR allele extraction using a machine learning paradigm. }}
To generate wiki pages
HfdIndexSelect -h $EXPLOR_AREA/Data/Main/Exploration/RBID.i -Sk "pubmed:31698330" \
  | HfdSelect -Kh $EXPLOR_AREA/Data/Main/Exploration/biblio.hfd \
  | NlmPubMed2Wicri -a MersV1
This area was generated with Dilib version V0.6.33.